{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**By 加号**\n",
    "\n",
    "微博：[@翻滚吧\\_加号](https://www.weibo.com/4u2go)\n",
    "\n",
    "这一波的代码其实非常的简单,\n",
    "\n",
    "就是给大家展示一下rule-based的玩法,\n",
    "\n",
    "以及几个角度的升级。\n",
    "\n",
    "首先，我们看一个\n",
    "\n",
    "### 最基础版本的rule-base机器人\n",
    "\n",
    "基本就是小学生级别的 问什么 答什么"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      ">>> fasd\n",
      "I did not understand what you said\n",
      ">>> How are you?\n",
      "I'm fine\n",
      ">>> How are you doing?\n",
      "I'm fine\n",
      ">>> How are you doing?\n",
      "I'm fine\n",
      ">>> How are you doing?\n",
      "I'm fine\n",
      ">>> \n",
      "I did not understand what you said\n",
      ">>> How are you\n",
      "I did not understand what you said\n"
     ]
    }
   ],
   "source": [
    "import random\n",
    "\n",
    "# 打招呼\n",
    "greetings = ['hola', 'hello', 'hi', 'Hi', 'hey!','hey']\n",
    "# 回复打招呼\n",
    "random_greeting = random.choice(greetings)\n",
    "\n",
    "# 对于“你怎么样？”这个问题的回复\n",
    "question = ['How are you?','How are you doing?']\n",
    "# “我很好”\n",
    "responses = ['Okay',\"I'm fine\"]\n",
    "# 随机选一个回\n",
    "random_response = random.choice(responses)\n",
    "\n",
    "# 机器人跑起来\n",
    "while True:\n",
    "    userInput = input(\">>> \")\n",
    "    if userInput in greetings:\n",
    "        print(random_greeting)\n",
    "    elif userInput in question:\n",
    "        print(random_response)\n",
    "    # 除非你说“拜拜”\n",
    "    elif userInput == 'bye':\n",
    "        break\n",
    "    else:\n",
    "        print(\"I did not understand what you said\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 升级I:\n",
    "\n",
    "显然 这样的rule太弱智了，我们需要更好一点的“精准对答”\n",
    "\n",
    "比如\n",
    "\n",
    "透过关键词来判断这句话的意图是什么（intents）"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from nltk import word_tokenize\n",
    "import random\n",
    "\n",
    "# 打招呼\n",
    "greetings = ['hola', 'hello', 'hi', 'Hi', 'hey!','hey']\n",
    "# 回复打招呼\n",
    "random_greeting = random.choice(greetings)\n",
    "\n",
    "# 对于“假期”的话题关键词\n",
    "question = ['break','holiday','vacation','weekend']\n",
    "# 回复假期话题\n",
    "responses = ['It was nice! I went to Paris',\"Sadly, I just stayed at home\"]\n",
    "# 随机选一个回\n",
    "random_response = random.choice(responses)\n",
    "\n",
    "# 机器人跑起来\n",
    "while True:\n",
    "    userInput = input(\">>> \")\n",
    "    # 清理一下输入，看看都有哪些词\n",
    "    cleaned_input = word_tokenize(userInput)\n",
    "    # 这里，我们比较一下关键词，确定他属于哪个问题\n",
    "    if  not set(cleaned_input).isdisjoint(greetings):\n",
    "        print(random_greeting)\n",
    "    elif not set(cleaned_input).isdisjoint(question):\n",
    "        print(random_response)\n",
    "    # 除非你说“拜拜”\n",
    "    elif userInput == 'bye':\n",
    "        break\n",
    "    else:\n",
    "        print(\"I did not understand what you said\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "大家大概能发现，这依旧是文字层面的“精准对应”。\n",
    "\n",
    "现在主流的研究方向，是做到语义层面的对应。\n",
    "\n",
    "比如，“肚子好饿哦”， “饭点到了”\n",
    "\n",
    "都应该表示的是要吃饭了的意思。\n",
    "\n",
    "在这个层面，就需要用到word vector之类的embedding方法，\n",
    "\n",
    "这部分内容 日后的课上会涉及到。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 升级II：\n",
    "\n",
    "光是会BB还是不行，得有知识体系！才能解决用户的问题。\n",
    "\n",
    "我们可以用各种数据库，建立起一套体系，然后通过搜索的方式，来查找答案。\n",
    "\n",
    "比如，最简单的就是Python自己的graph数据结构来搭建一个“地图”。\n",
    "\n",
    "依据这个地图，我们可以清楚的找寻从一个地方到另一个地方的路径，\n",
    "\n",
    "然后作为回答，反馈给用户。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "# 建立一个基于目标行业的database\n",
    "# 比如 这里我们用python自带的graph\n",
    "graph = {'上海': ['苏州', '常州'],\n",
    "         '苏州': ['常州', '镇江'],\n",
    "         '常州': ['镇江'],\n",
    "         '镇江': ['常州'],\n",
    "         '盐城': ['南通'],\n",
    "         '南通': ['常州']}\n",
    "\n",
    "# 明确如何找到从A到B的路径\n",
    "def find_path(start, end, path=[]):\n",
    "    path = path + [start]\n",
    "    if start == end:\n",
    "        return path\n",
    "    if start not in graph:\n",
    "        return None\n",
    "    for node in graph[start]:\n",
    "        if node not in path:\n",
    "            newpath = find_path(node, end, path)\n",
    "            if newpath: return newpath\n",
    "    return None"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "print(find_path('上海', \"镇江\"))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "同样的构建知识图谱的玩法，\n",
    "\n",
    "也可以使用一些Logic Programming，比如上个世纪学AI的同学都会学的Prolog。\n",
    "\n",
    "或者比如，python版本的prolog：PyKE。\n",
    "\n",
    "他们可以构建一种复杂的逻辑网络，让你方便提取信息，\n",
    "\n",
    "而不至于需要你亲手code所有的信息:\n",
    "\n",
    "\n",
    "    son_of(bruce, thomas, norma)\n",
    "    son_of(fred_a, thomas, norma)\n",
    "    son_of(tim, thomas, norma)\n",
    "    daughter_of(vicki, thomas, norma)\n",
    "    daughter_of(jill, thomas, norma)\n",
    "\n",
    "\n",
    "\n",
    "### 升级III：\n",
    "\n",
    "任何行业，都分个前端后端。\n",
    "\n",
    "AI也不例外。\n",
    "\n",
    "我们这里讲的算法，都是后端跑的。\n",
    "\n",
    "那么， 为了做一个靠谱的前端，很多项目往往也需要一个简单易用，靠谱的前端。\n",
    "\n",
    "比如，这里，利用Google的API，写一个类似钢铁侠Tony的语音小秘书Jarvis：\n",
    "\n",
    "我们先来看一个最简单的说话版本。\n",
    "\n",
    "利用gTTs(Google Text-to-Speech API), 把文本转化为音频。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "#可以忽略，我只是为了页面美观\n",
    "import warnings\n",
    "warnings.simplefilter('ignore')\n",
    "\n",
    "from gtts import gTTS\n",
    "import os\n",
    "\n",
    "tts = gTTS(text='您好，我是您的私人助手，我叫小辣椒', lang='zh-tw')\n",
    "tts.save(\"hello.mp3\")\n",
    "os.system(\"mpg321 hello.mp3\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "同理，\n",
    "\n",
    "有了文本到语音的功能，\n",
    "\n",
    "我们还可以运用Google API读出Jarvis的回复：\n",
    "\n",
    "（注意：这里需要你的机器安装几个库 SpeechRecognition, PyAudio 和 PySpeech）"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import speech_recognition as sr\n",
    "from time import ctime\n",
    "import time\n",
    "import os\n",
    "from gtts import gTTS\n",
    "import sys\n",
    "from pypinyin import lazy_pinyin\n",
    " \n",
    "# 讲出来AI的话\n",
    "def speak(audioString):\n",
    "    print(audioString)\n",
    "    tts = gTTS(text=audioString, lang='zh-tw')\n",
    "    tts.save(\"audio.mp3\")\n",
    "    os.system(\"mpg321 audio.mp3\")\n",
    "\n",
    "# 录下来你讲的话\n",
    "def recordAudio():\n",
    "    # 用麦克风记录下你的话\n",
    "    r = sr.Recognizer()\n",
    "    with sr.Microphone() as source:\n",
    "        audio = r.listen(source)\n",
    " \n",
    "    # 用Google API转化音频\n",
    "    data = \"\"\n",
    "    try:\n",
    "        # 这里注意，我讲不出台湾腔，为了准确度，使用标准普通话\n",
    "        data = r.recognize_google(audio, language='zh-cn')\n",
    "        print(\"你说：\" + data)\n",
    "    except sr.UnknownValueError:\n",
    "        #print(\"对不起，我不能理解你的话\")\n",
    "        #为了输出不要太多乱七八糟的东西，我这里直接pass\n",
    "        pass\n",
    "    except sr.RequestError as e:\n",
    "        print(\"无法从谷歌服务中找到符合你要求的内容; {0}\".format(e))\n",
    " \n",
    "    return data\n",
    "\n",
    "# 自带的对话技能（rules）\n",
    "def jarvis():\n",
    "    \n",
    "    while True:\n",
    "        \n",
    "        data = recordAudio()\n",
    "        \n",
    "        #不让它什么都回，只回『小辣椒』开头的句子\n",
    "        if \"辣椒\" in data:\n",
    "            speak(\"我在！\")\n",
    "            #再听一句\n",
    "            data = recordAudio()\n",
    "            #问候\n",
    "            if \"你好吗\" in data:\n",
    "                speak(\"我很好\")\n",
    "            #时间\n",
    "            elif \"时间\" in data:\n",
    "                speak(ctime())\n",
    "            #打开网页\n",
    "            elif \"在哪里\" in data:\n",
    "                #把后面三位：『在哪里』去掉\n",
    "                location = data[:-3]\n",
    "                speak(\"别着急宝贝，我给你看看\" + location + \"在哪里\")\n",
    "                #这里有个问题就是，google的url打开不了『上海』这个中文。如果换成pinyin的话就没问题了\n",
    "                #所以我们用PyPinYin\n",
    "                pinyin = lazy_pinyin(location)\n",
    "                pinyin_location = ''\n",
    "                for i in pinyin:\n",
    "                    pinyin_location += i\n",
    "                os.system(\"open -a Safari https://www.google.com/maps/place/\" + pinyin_location + \"/&amp;\")\n",
    "            \n",
    "            #打开word，并输入文字\n",
    "            elif \"编辑\" in data:\n",
    "                text = data[2:]\n",
    "                speak(\"走你\")\n",
    "                os.system(\"python3 ./docx_commands/create_doc.py -t \"+text)\n",
    "                os.system(\"open demo.docx\")\n",
    "            # 调整字体大小\n",
    "            elif \"放大\" in data:\n",
    "                speak(\"行嘞\")\n",
    "                os.system(\"python3 ./docx_commands/larger_font.py\")\n",
    "                os.system(\"open demo.docx\")\n",
    "            # 加粗字体\n",
    "            elif \"加粗\" in data:\n",
    "                speak(\"好的\")\n",
    "                os.system(\"python3 ./docx_commands/bold_text.py\")\n",
    "                os.system(\"open demo.docx\")\n",
    "            # 罗氏理解万岁\n",
    "            elif \"蠢\" in data:\n",
    "                speak(\"理解万岁\")\n",
    "            # 挥一挥手\n",
    "            elif \"再见\" in data:\n",
    "                speak(\"再见\")\n",
    "                break\n",
    "\n",
    "\n",
    "# 跑起\n",
    "jarvis()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "不仅仅是语音前端。\n",
    "\n",
    "包括应用场景：微信，slack，Facebook Messager，等等 都可以把我们的ChatBot给integrate进去。\n",
    "\n",
    "大家有兴趣可以报名我们的课程\n",
    "\n",
    "一点点来学起机器学习与人工智能\n",
    "\n",
    "欧了~"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.5"
  },
  "toc": {
   "base_numbering": 1,
   "nav_menu": {},
   "number_sections": true,
   "sideBar": true,
   "skip_h1_title": false,
   "title_cell": "Table of Contents",
   "title_sidebar": "Contents",
   "toc_cell": false,
   "toc_position": {},
   "toc_section_display": true,
   "toc_window_display": false
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
